-
Notifications
You must be signed in to change notification settings - Fork 657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce a bootstrap command to auto-install packages #650
Conversation
a85749a
to
4d20855
Compare
57b9a53
to
4ecd50e
Compare
0bd266d
to
25bbf76
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blocking comment on having hardcoded references to existing instrumentations. Also, please refrain from any single use variable or function, it makes it pretty hard to review code when you have to be jumping around the file.
instrumentations = { | ||
"dbapi": "opentelemetry-ext-dbapi>=0.6b0", | ||
"flask": "opentelemetry-ext-flask>=0.6b0", | ||
"grpc": "opentelemetry-ext-grpc>=0.6b0", | ||
"requests": "opentelemetry-ext-http-requests>=0.6b0", | ||
"mysql": "opentelemetry-ext-mysql>=0.6b0", | ||
"psycopg2": "opentelemetry-ext-psycopg2>=0.6b0", | ||
"pymongo": "opentelemetry-ext-pymongo>=0.6b0", | ||
"pymysql": "opentelemetry-ext-pymysql", | ||
"redis": "opentelemetry-ext-redis", | ||
"sqlalchemy": "opentelemetry-ext-sqlalchemy", | ||
"wsgi": "opentelemetry-ext-wsgi>=0.6b0", | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not have hardcoded references to existing instrumentations. This list should be compiled from the available packages that implement the opentelemetry-instrumentor
entry point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a way to get this information at runtime or do you want to make it a build step and just automate generation of this list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this can help. Let me know if it does, please 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if it does. This can help me get a list of installed instrumentations at runtime right?
This script is supposed to be used to install the required instrumentation packages. If we were to automate this, I think the way to do it would be to create a dev script that walks over the ext packages, figures out which ones are auto-instrumentation packages, then figures out which libraries they instrument and then generate a .py file for bootstrap.py to import and use. Even if we do that, in my experience working with a similar command for SignalFx's Python instrumentation, we might need to pin specific versions of different instrumentation in different versions of the bootstrap script so it might not always be feasible to auto-generate such a list without adding some "hard-coded" annotations to instrumentations or bootstrap script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a tricky problem. @owais can you file an issue to see if we can track a more automated way to do this, or just maybe include this in documentation for auto-instrumentation?
There's merit in hardcoding as well: we don't add magic auto-instrumentation until we've tested things sufficiently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
) | ||
|
||
|
||
def _pip_check(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please refrain from single use functions? It forces the reader to look back in the file while reading this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think single use functions are good, when you are reading the code you find the function and you know what it is doing without having to understand how it is done. If you remove them, you'll end up with a very long function having to understand all the implementation details at once.
_OUTPUT_INSTALL = "install" | ||
_OUTPUT_REQUIREMENTS = "requirements" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move these module-level constants inside run
since they are only used there.
cmd = { | ||
_OUTPUT_INSTALL: _run_install, | ||
_OUTPUT_REQUIREMENTS: _run_requirments, | ||
}[args.output] | ||
cmd(_find_installed_libraries()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cmd = { | |
_OUTPUT_INSTALL: _run_install, | |
_OUTPUT_REQUIREMENTS: _run_requirments, | |
}[args.output] | |
cmd(_find_installed_libraries()) | |
{ | |
_OUTPUT_INSTALL: _run_install, | |
_OUTPUT_REQUIREMENTS: _run_requirments, | |
}[args.output](_find_installed_libraries()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had that but it was just too much going on in a single line :)
return library in sys.modules or pkgutil.find_loader(library) is not None | ||
|
||
|
||
def _find_installed_libraries(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single use function here
pip_check = check_pipe.communicate()[0].decode() | ||
pip_check_lower = pip_check.lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pip_check = check_pipe.communicate()[0].decode() | |
pip_check_lower = pip_check.lower() | |
pip_check_lower = check_pipe.communicate()[0].decode().lower() |
Single use functions were used because they are all system calls and can be more easily substituted if needed for example, can be mocked in tests. Making system calls inline makes mocking harder and less clear. I think it is worth it wrapping system calls in their own functions but happy to re-consider. |
Ok, making things easier to mock in tests is actually a good reason for single use functions, let's just try to use them only when strictly necessary 👍 |
opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py
Outdated
Show resolved
Hide resolved
opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py
Outdated
Show resolved
Hide resolved
opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py
Outdated
Show resolved
Hide resolved
opentelemetry-auto-instrumentation/src/opentelemetry/auto_instrumentation/bootstrap.py
Outdated
Show resolved
Hide resolved
"mysql": "opentelemetry-ext-mysql>=0.6b0", | ||
"psycopg2": "opentelemetry-ext-psycopg2>=0.6b0", | ||
"pymongo": "opentelemetry-ext-pymongo>=0.6b0", | ||
"pymysql": "opentelemetry-ext-pymysql", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why some of them are missing the version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They hadn't been published on pypi yet.
parser.add_argument( | ||
"-o", | ||
"--output", | ||
choices=[_OUTPUT_INSTALL, _OUTPUT_REQUIREMENTS], | ||
default=_OUTPUT_INSTALL, | ||
help=""" | ||
install - uses pip to install the new requirements using to the currently active site-package. | ||
requirements - prints out the new requirements to stdout. Output can be piped and appended to | ||
a requirements.txt file. | ||
""", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need to have the intermediate --output
command?
opentelemetry-bootstrap -o install
looks weird to me.
What about removing it and having something like the following?
opentelemetry-bootstrap --install
opentelemetry-bootstrap --requirements
Btw, should a call without arguments (opentelemetry-bootstrap
) perform the installation? I got surprised by it, I run opentelemetry-bootstrap
to see what was it about and suddenly it was performing a lot of installations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed we might have a more actions in future like updating requirements.txt file in place or support for other tools like pipenv, poetry, etc. Would --action/-a
be a better name for the flag instead of --output/-o
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced with --action/-a
instead and change the default to printing requirements.
OpenTelemetry auto-instrumentation packages often have traced libraries as instrumentation dependency | ||
(e.g. flask for opentelemetry-ext-flask), so using -I on library could cause likely undesired Flask upgrade. | ||
Using --no-dependencies alone would leave potential for nonfunctional installations. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you please mind wrapping these long lines?
logger.info( | ||
"Existing %s installation detected. Uninstalling.", package | ||
) | ||
_pip_uninstall(package) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if we should uninstall packages, or at least, if it should be the default behavior without a flag for that. I can imagine an use case where the user has a working setup but he's missing the instrumentation for some library, he tries opentelemetry-bootstrap
and boom, all his instrumentation packages are updated, this could break this setup.
I think in this case we should just print a message telling the user that the instrumentation is already installed and not actions will be performed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also noticed that this logic is uninstalling the instrumentation even if the version to be installed is the same one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reason to uninstall is that pip does not update packages with pip install
for some sources github being one. So if a dependency installs directly from github, pip install
does not update to newest version. We are not referencing any packages to be installed directly from github right now and not sure if at least the production ever will. We can remove this if everyone thinks we'll always be referencing pypi packages only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
he tries opentelemetry-bootstrap and boom, all his instrumentation packages are updated, this could break this setup.
Well, the intention of the command is to install/update the dependencies. We can make dry-run the default action for the command and print the dependencies, and force users to issue --install
flag to actually make changes to site-packages like you suggested in another comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made printing the requirements the default action. LMK what you think about that.
) | ||
|
||
|
||
def _pip_check(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think single use functions are good, when you are reading the code you find the function and you know what it is doing without having to understand how it is done. If you remove them, you'll end up with a very long function having to understand all the implementation details at once.
OpenTelemetry auto-instrumentation packages often have traced libraries as instrumentation dependency | ||
(e.g. flask for opentelemetry-ext-flask), so using -I on library could cause likely undesired Flask upgrade. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to clarify it in a broader way. Do we want instrumentations to have install dependencies on the instrumented libraries?
If a user does pip install opentelemetry-ext-flask
, should it update/install the Flask version installed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think they should. It causes more problems than it solved but it is what we do right now. Examples https://github.com/open-telemetry/opentelemetry-python/blob/master/ext/opentelemetry-ext-flask/setup.cfg#L43 https://github.com/open-telemetry/opentelemetry-python/blob/master/ext/opentelemetry-ext-django/setup.cfg#L41
There is a discussion regarding hardcoding, hardcoding may end up being the preferred approach. Dismissing to unblock this while the discussion is ongoing.
# relevant instrumentors and tracers to uninstall and check for conflicts for target libraries | ||
libraries = { | ||
"dbapi": ("opentelemetry-ext-dbapi",), | ||
"flask": ("opentelemetry-ext-flask",), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that opentelemetry-ext-django
is merged, can you add django
as well?
|
||
_ACTION_INSTALL = "install" | ||
_ACTION_REQUIREMENTS = "requirements" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be preferred to have a description of what this module does here as well as in the README so users will know what is going on and being installed automatically.
8c07b89
to
d016505
Compare
Thanks @ocelotl. @open-telemetry/python-approvers PTAL. I've fixed issues raised by most of the comments or replied to them. Working on add docs for the new command now. |
dd8784f
to
78784bb
Compare
Took care of the remaining comments and updated autoinstrumentation package docs. PTAL. |
d181b78
to
ab53d36
Compare
"dbapi": "opentelemetry-ext-dbapi>=0.8.8", | ||
"django": "opentelemetry-ext-django>=0.8.8", | ||
"flask": "opentelemetry-ext-flask>=0.8.8", | ||
"grpc": "opentelemetry-ext-grpc>=0.8.8", | ||
"requests": "opentelemetry-ext-requests>=0.8.8", | ||
"jinja2": "opentelemetry-ext-jinja2>=0.8.8", | ||
"mysql": "opentelemetry-ext-mysql>=0.8.8", | ||
"psycopg2": "opentelemetry-ext-psycopg2>=0.8.8", | ||
"pymongo": "opentelemetry-ext-pymongo>=0.8.8", | ||
"pymysql": "opentelemetry-ext-pymysql>=0.8.8", | ||
"redis": "opentelemetry-ext-redis>=0.8.8", | ||
"sqlalchemy": "opentelemetry-ext-sqlalchemy>=0.8.8", | ||
"wsgi": "opentelemetry-ext-wsgi>=0.8.8", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the version numbers correct here? I would expect them to be 0.8b0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No idea how they crept in. I think search/replace command messed them up. Fixed now. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still seeing 0.8.8 unless im missing something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad. Should be fixed now. The push didn't go through last time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just blocking on the question of versions, otherwise this looks great!
0397b54
to
1823ff2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
@owais please update the changelog, then i'll merge it |
This commit introduces a new boostrap command that is shipped as part of the opentelemetry-auto-instrumentation package. The command detects installed libraries and installs the relevant auto-instrumentation packages.
Updated. @codeboten |
This commit introduces a new boostrap command that is shipped as part of
the opentelemetry-auto-instrumentation package. The command detects
installed libraries and installs the relevant auto-instrumentation
packages.